
What do you do when you have a project that requires a specific tool, and the tool (to your knowledge) doesn’t exist?

YOU BUILD IT!

And that’s what I did. I needed to transcribe several hundred audio files, but I didn’t want to use OpenAI’s API (money I don’t want to spend). I looked around GitHub and Docker Hub and didn’t find what I was looking for: the ability to transcribe hundreds of audio files with no user interaction. Now, if I’m going to build my own tool, I want it to be as OS agnostic as possible, which means it needs to run on:

  • Microsoft Windows
  • Linux (Ubuntu, Pop!_OS, wattOS)
  • macOS

From the title of my post, you can probably figure out what I decided to go with. Yep, Docker. Docker has many advantages, with the top one (in my book) being that it is OS agnostic. So, without further ado, let’s get started!

Prerequisites

For this post, I’m using Linux, specifically Ubuntu, as the OS for building this container. Most, if not all, of the steps outlined here are specific to Linux. If you don’t have Docker installed, go install it along with Docker Compose, and meet me back here. Oh, and yeah, you’re going to need an NVIDIA GPU for this too.
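Before moving on, a quick check can confirm the prerequisites are in place. This is only a minimal sketch: `missing_tools` is a hypothetical helper name, and checking for `docker` and `nvidia-smi` on the PATH is my assumption about what "installed" means here. It does not verify the NVIDIA Container Toolkit, which the `runtime: nvidia` setting used later also relies on.

```shell
#!/bin/bash
# Print the name of every given command that is not found on PATH.
# missing_tools is a hypothetical helper for this sketch.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Assumption: docker and nvidia-smi are the commands worth checking.
missing=$(missing_tools docker nvidia-smi)
if [ -n "$missing" ]; then
    echo "Missing prerequisites:" $missing
else
    echo "docker and nvidia-smi found"
fi
```

If Docker is present, running `docker compose version` is a simple way to confirm that the Compose plugin is installed as well.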

Step 1: Set Up Directory Structure

To begin, you’ll need to organize your project directory. Here’s the directory structure we’ll use:

/home/username/docker/whisper/
│
├── whisper-transcription/
│   ├── Dockerfile
│   ├── transcribe.sh
│   ├── docker-compose.yml
│
├── audio-files/
│   ├── file1.mp3
│   ├── file2.wav
│   ├── subdir/
│   │   ├── file3.m4a
│   │   └── file4.mp3
│   └── file5.wav
│
└── transcriptions/

Create the directories:

mkdir -p /home/username/docker/whisper/whisper-transcription
mkdir -p /home/username/docker/whisper/audio-files
mkdir -p /home/username/docker/whisper/transcriptions

Step 2: Create the Dockerfile

The Dockerfile defines the environment in which Whisper will run. Here’s how to create it:

# Dockerfile
FROM nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu22.04

# Avoid interactive apt prompts and configure Python output behavior
ENV DEBIAN_FRONTEND=noninteractive
ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1

# Install Python, ffmpeg, and other dependencies
RUN apt-get update && apt-get install -y \
    python3-pip \
    python3-dev \
    ffmpeg \
    && rm -rf /var/lib/apt/lists/*

# Install Whisper and its dependencies
RUN pip install --upgrade pip
RUN pip install openai-whisper

WORKDIR /app

COPY transcribe.sh /app/transcribe.sh

RUN chmod +x /app/transcribe.sh

ENTRYPOINT ["/app/transcribe.sh"]

Step 3: Create the Transcription Script

This script tells Whisper how to process the audio files:

#!/bin/bash
# transcribe.sh

# Function to transcribe a single file
transcribe_file() {
    local file="$1"
    echo "Transcribing $file"
    # Write to the mounted output directory (absolute path, independent of WORKDIR)
    whisper "$file" --output_dir /app/transcriptions
}

# Export the function so it can be used with find
export -f transcribe_file

# Find and transcribe all audio files recursively
find /app/audio -type f \( -name "*.mp3" -o -name "*.wav" -o -name "*.m4a" \) -exec bash -c 'transcribe_file "$1"' _ {} \;

Make the script executable:

chmod +x /home/username/docker/whisper/whisper-transcription/transcribe.sh
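Since the script picks files purely by the `find` expression, it can help to preview what that expression matches before spending GPU time. The sketch below reuses the same expression on the host. `list_audio_files` and `AUDIO_DIR` are hypothetical names for this example, and the default path assumes the directory layout from Step 1. Note that `-name` is case-sensitive, so a file like `TRACK.MP3` would not be listed (or transcribed).

```shell
#!/bin/bash
# List the audio files that the find expression in transcribe.sh would match,
# without calling whisper. list_audio_files is a hypothetical helper.
list_audio_files() {
    local dir="$1"
    find "$dir" -type f \( -name "*.mp3" -o -name "*.wav" -o -name "*.m4a" \) | sort
}

# AUDIO_DIR is a hypothetical variable; the default assumes the Step 1 layout.
AUDIO_DIR="${AUDIO_DIR:-$HOME/docker/whisper/audio-files}"
if [ -d "$AUDIO_DIR" ]; then
    list_audio_files "$AUDIO_DIR"
else
    echo "No such directory: $AUDIO_DIR"
fi
```

If files with upper-case extensions should be included, swapping `-name` for `-iname` in both places is one option.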

Step 4: Create the Docker Compose File

The docker-compose.yml file will manage the services and configuration:

# docker-compose.yml

services:
  whisper-transcriber:
    build: .
    volumes:
      - ~/docker/whisper/audio-files:/app/audio
      - ~/docker/whisper/transcriptions:/app/transcriptions
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]
    runtime: nvidia

Step 5: Build and Run the Docker Image

Navigate to the directory and build the Docker image:

cd /home/username/docker/whisper/whisper-transcription
docker compose build

Place your audio files in the ‘audio-files’ directory, then run the transcription service:

docker compose up

You can monitor the process in real time in the terminal output. Once the transcription is complete, stop the service:

docker compose stop

If you encounter any issues, you can remove the container and start fresh:

docker compose down
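Once a run has finished, one way to confirm that every audio file got a transcript is to compare the two directories. This is a minimal sketch, assuming Whisper's default behavior of writing a `.txt` file named after the audio file's base name into the output directory. `missing_transcripts` and `BASE_DIR` are hypothetical names for this example. Because all output lands in one flat directory, two audio files with the same base name (for example in different subdirectories) would map to the same transcript name, and this check cannot tell them apart.

```shell
#!/bin/bash
# Print every audio file that has no matching .txt transcript.
# Assumes transcripts are named <audio base name>.txt in the output directory.
missing_transcripts() {
    local audio_dir="$1"
    local out_dir="$2"
    local file base
    find "$audio_dir" -type f \( -name "*.mp3" -o -name "*.wav" -o -name "*.m4a" \) | sort |
    while IFS= read -r file; do
        base=$(basename "$file")
        base="${base%.*}"          # strip the extension: file1.mp3 -> file1
        if [ ! -f "$out_dir/$base.txt" ]; then
            echo "$file"
        fi
    done
}

# BASE_DIR is a hypothetical variable; the default assumes the Step 1 layout.
BASE_DIR="${BASE_DIR:-$HOME/docker/whisper}"
if [ -d "$BASE_DIR/audio-files" ]; then
    missing_transcripts "$BASE_DIR/audio-files" "$BASE_DIR/transcriptions"
fi
```

An empty result means every matched audio file has a transcript with the expected name.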

Final Thoughts

Yes, I know what you’re thinking: “Why didn’t you leave it as a docker run command?” Good question! At first, I started to do so. But I’m a big believer in the ‘Document As Code’ philosophy, and Docker Compose lets me follow it to a greater degree than a one-off docker run command does.
